Abstract

A robust classification method is developed on the basis of sparse
subspace decomposition. The method decomposes a mixture of subspaces
of unlabeled data (queries) into as few class subspaces as possible.
Each query is classified into the class whose subspace contributes
significantly to the decomposition. Multiple queries from different
classes can be classified simultaneously into their respective
classes. A practical greedy algorithm for the sparse subspace
decomposition is designed for the classification. The method achieves
a high recognition rate and robust performance by exploiting joint
sparsity.



1. Introduction 

Classification is the task of assigning one or more class labels to
unlabeled data (query data). A collection of labeled data (training
data) is available for the classification. The patterns or signals to
be classified are usually groups of measurement data expressed as
high-dimensional vectors.

Depending on the purpose, we need pattern classifiers that can assign

• a label to each query,

• a label to a set of queries,

• a few labels to each query,

• the label "invalid" to an unclassifiable query.

We develop a framework that uses subspaces for all of these
functionalities. We regard the unlabeled data as a mixture of
subspaces. The key idea is to decompose this mixture into as few
class subspaces as possible. Only the classes that concisely explain
the mixture are relevant to the unlabeled data. In classification,
the unlabeled data are usually supposed to belong to a few classes
(typically one). Therefore, the classification process can be
interpreted as a sparse decomposition of the subspace mixture.



This work is inspired by the recently emerging field of compressed
sensing [1] [2] [3] [4] [5] and its innovative applications to robust
face recognition [6], action recognition [7], and computer vision and
image processing [8]. The essential idea of these works is to exploit
the prior knowledge that a signal is sparse and compressible. The
theory of compressed sensing helps us to answer questions such as
"How many measurements are enough for pattern recognition?" and
"What is the role of feature extraction?" It is worthwhile to explore
the potential of sparse decomposition for substantially improving the
subspace methods.

The rest of this paper is organized as follows. Section 2 provides
preliminary details and definitions of the subspace representation
for sparse decomposition. In Section 3, we propose a classification
method named the sparse subspace method, which exploits the
sparseness property for the classification tasks described above. A
practical algorithm for the sparse subspace decomposition is
presented in Section 4. We show some tentative evaluation results of
the sparse subspace method on a face database in Section 5 before
concluding in Section 6.

2. Preliminaries 

Let $\mathbf{S}_k \in \mathbb{R}^{d \times n_k}$ be the matrix of the
training dataset of the $k$-th class ($k = 1, \ldots, C$), in which
$n_k$ labeled patterns are represented as $d$-dimensional column
feature vectors. In the following, we describe the linear subspaces,
their union, block sparsity, and the sparse linear representation of
a subspace. We also define a classification space in which sparsity
should be encouraged.

Linear subspaces of training datasets The class subspace is defined
as a vector subspace whose elements are the feature vectors of
labeled data. We describe the subspace as a vector subspace in the
normed space:

$$ \mathcal{S}_k := \operatorname{span} \mathbf{S}_k \subset (\mathbb{R}^d, l^2). \tag{1} $$

$\mathcal{S}_k$ approximates the $k$-th class subspace. We denote the
dimensionality of $\mathcal{S}_k$ by
$\dim \mathcal{S}_k = \operatorname{rank} \mathbf{S}_k$.
Union of subspaces The union of subspaces is the subspace obtained by
combining the feature vectors of all classes:

$$ \mathcal{S} := \bigcup_{k=1}^{C} \mathcal{S}_k = \operatorname{span} \mathbf{S} \subset (\mathbb{R}^d, l^2). \tag{2} $$

Here, $\mathbf{S}$ is the concatenation of $\mathbf{S}_k$ as

$$ \mathbf{S} := [\mathbf{S}_1, \ldots, \mathbf{S}_C] \in \mathbb{R}^{d \times N} \tag{3} $$

and $N := \sum_{k=1}^{C} n_k$. The dimensionality of $\mathcal{S}$ is
denoted by $\dim \mathcal{S} = \operatorname{rank} \mathbf{S}$.

We say that the subspaces $\mathcal{S}_k$ ($k = 1, \ldots, C$) are
independent if and only if no subspace $\mathcal{S}_k$ is a subset of
the union of the other subspaces, i.e.,
$\mathcal{S}_k \not\subset \bigcup_{j \neq k} \mathcal{S}_j$ for $\forall k$.
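The independence condition can be checked numerically. The following is a minimal sketch, assuming that the union of the other subspaces is read as the span of their concatenated training matrices, as in (2). The helper name `is_independent` and the toy matrices are illustrative assumptions.

```python
import numpy as np

def is_independent(blocks, k, tol=1e-10):
    """Return True if span(S_k) is not contained in the span of the other blocks."""
    others = np.hstack([B for j, B in enumerate(blocks) if j != k])
    rank_others = np.linalg.matrix_rank(others, tol=tol)
    rank_all = np.linalg.matrix_rank(np.hstack([others, blocks[k]]), tol=tol)
    # S_k contributes at least one new direction exactly when the rank grows.
    return bool(rank_all > rank_others)

e1, e2, e3 = np.eye(3)
S1 = np.column_stack([e1, e2])      # class 1 spans the x-y plane
S2_inside = (e1 + e2)[:, None]      # lies inside span(S1)
S2_outside = e3[:, None]            # adds a new direction

print(is_independent([S1, S2_inside], 1))
print(is_independent([S1, S2_outside], 1))
```

In this toy setting the second class fails the test when its only training vector already lies in the plane of the first class.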

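Linear representation of vector(s) Given a sufficient training
dataset, a $d$-dimensional vector $\mathbf{q}$ of unlabeled data
(hereafter "query" vector) will be approximately represented as a
linear combination of vectors from the class subspaces:

$$ \mathbf{q} = \sum_{k=1}^{C} \mathbf{S}_k \boldsymbol{\alpha}_k = \mathbf{S}\boldsymbol{\alpha}. \tag{4} $$

Here, $\boldsymbol{\alpha}_k \in (\mathbb{R}^{n_k}, l^2)$ is a vector
of coefficients corresponding to the $k$-th class, and

$$ \boldsymbol{\alpha} := \begin{bmatrix} \boldsymbol{\alpha}_1 \\ \vdots \\ \boldsymbol{\alpha}_C \end{bmatrix} \in (\mathbb{R}^N, l^2) \tag{5} $$

is the concatenation of $\boldsymbol{\alpha}_k$.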

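If a set of queries is given as a matrix

$$ \mathbf{Q} := [\mathbf{q}^{(1)}, \ldots, \mathbf{q}^{(n)}] \in \mathbb{R}^{d \times n}, \tag{6} $$

then we will solve

$$ \mathbf{Q} = \mathbf{S}\mathbf{A}. \tag{7} $$

Here,

$$ \mathbf{A} := [\boldsymbol{\alpha}^{(1)}, \ldots, \boldsymbol{\alpha}^{(n)}] \in \mathbb{R}^{N \times n} \tag{8} $$

is the matrix of unknown coefficients, and

$$ \boldsymbol{\alpha}^{(j)} := \begin{bmatrix} \boldsymbol{\alpha}_1^{(j)} \\ \vdots \\ \boldsymbol{\alpha}_C^{(j)} \end{bmatrix} \tag{9} $$

is the concatenated vector of coefficients for the $j$-th query.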
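The matrix $\mathbf{A}$ can also be described as

$$ \mathbf{A} = \begin{bmatrix} \mathbf{A}_1 \\ \vdots \\ \mathbf{A}_C \end{bmatrix}, \tag{10} $$

where

$$ \mathbf{A}_k := [\boldsymbol{\alpha}_k^{(1)}, \ldots, \boldsymbol{\alpha}_k^{(n)}] \in \mathbb{R}^{n_k \times n}. \tag{11} $$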



The system of linear equations (7) is called the problem for multiple
measurement vectors (MMV), while the case of a single measurement
$n = 1$ as in (4) is referred to as SMV [9] [10] [11]. The query
vectors correspond to the measurements in this context.

Uniqueness The solution $\boldsymbol{\alpha}$ to (4) or $\mathbf{A}$
to (7) exists if and only if

$$ \mathbf{q}^{(j)} \in \mathcal{S} \quad \forall j, \tag{12} $$

i.e., the queries lie on the union of class subspaces. For
$\dim \mathcal{S} < d$, the solution does not always exist. The
solution may be dense even if it exists: most components are nonzero
despite the fact that at most $n$ class subspaces are relevant to $n$
queries. This problem arises in the invalid situation where the
training datasets are insufficient to identify the class uniquely.

The actual problem we should cope with is the underdetermined case
$d = \dim \mathcal{S} < N$, i.e., the dimensionality of the union of
subspaces is less than the total number $N$ of training samples.
Unless the training data matrices $\mathbf{S}_k$ are rank-degenerate
so that $\dim \mathcal{S} < d$, the $C$ subspaces of training data
cannot be independent in the $d$-dimensional space. There are
infinitely many ways to express the query vector as a linear
combination of the subspace bases. The underdetermined problem
requires regularization to select a unique solution. A sparse
solution indicating the relevant classes is preferable.
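The non-uniqueness can be illustrated with a small numerical sketch. It assumes random Gaussian training matrices and toy sizes (all hypothetical), and uses the minimum $l^2$-norm solution from the pseudoinverse as one example of a dense solution. Both coefficient vectors reproduce the query, but only one of them points to a single class.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_k, C = 6, 4, 3                       # hypothetical toy sizes: d < N = 12
blocks = [rng.standard_normal((d, n_k)) for _ in range(C)]
S = np.hstack(blocks)                     # S = [S_1, ..., S_C] as in (3)

# A query that lies in the subspace of the second class only.
alpha_true = np.zeros(S.shape[1])
alpha_true[n_k:2 * n_k] = rng.standard_normal(n_k)
q = S @ alpha_true

# The minimum l2-norm solution also satisfies q = S alpha, but it is dense.
alpha_l2 = np.linalg.pinv(S) @ q

nnz_true = int(np.count_nonzero(np.abs(alpha_true) > 1e-10))
nnz_l2 = int(np.count_nonzero(np.abs(alpha_l2) > 1e-10))
print(np.allclose(S @ alpha_l2, q), nnz_true, nnz_l2)
```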

Block sparsity A vector $\boldsymbol{\xi} \in (\mathbb{R}^N, l^0)$ is
called $m$-sparse if $\|\boldsymbol{\xi}\|_0 \le m$. Here,
$\|\cdot\|_0$ denotes the $l^0$ norm, which counts the nonzero vector
components. As the support of a function is the subset of its domain
where it is nonzero, the support of a vector $\boldsymbol{\xi}$ is
defined as $T = \{i : \xi_i \neq 0\}$. The $l^0$ norm is the
cardinality of the support.

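We define a block-wise sparsity level in a similar manner to [11].
Let $f_{\mathcal{N}}$ be a map from
$\forall \mathbf{X} \in (\mathbb{R}^{N \times n}, l_F)$ to
$\boldsymbol{\gamma} \in (\mathbb{R}^C, l^0)$ according to a list
$\mathcal{N} := \{n_1, \ldots, n_C\}$ such that

$$ f_{\mathcal{N}} : \begin{bmatrix} \mathbf{X}_1 \\ \vdots \\ \mathbf{X}_C \end{bmatrix} \mapsto \begin{bmatrix} \|\mathbf{X}_1\|_F \\ \vdots \\ \|\mathbf{X}_C\|_F \end{bmatrix} := \boldsymbol{\gamma}. \tag{13} $$

Here, $\mathbf{X}_k \in (\mathbb{R}^{n_k \times n}, l_F)$ is the
$k$-th row block of $\mathbf{X}$ with respect to $\mathcal{N}$, and
$\|\cdot\|_F$ denotes the Frobenius norm $l_F$. Clearly,

$$ f_{\mathcal{N}} : \begin{bmatrix} \mathbf{x}_1 \\ \vdots \\ \mathbf{x}_C \end{bmatrix} \mapsto \begin{bmatrix} \|\mathbf{x}_1\|_2 \\ \vdots \\ \|\mathbf{x}_C\|_2 \end{bmatrix} \tag{14} $$

for $n = 1$. A vector $\mathbf{x} \in (\mathbb{R}^N, l^2)$ is called
block $M$-sparse over $\mathcal{N}$ if $\mathbf{x}_k \neq \mathbf{0}$
for at most $M$ indices $k$. The block sparsity is measured as

$$ \|\mathbf{x}\|_{0,\mathcal{N}} := \|f_{\mathcal{N}}(\mathbf{x})\|_0. \tag{15} $$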



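That is, $\|\cdot\|_{0,\mathcal{N}}$ counts the number of nonzero
blocks. We measure the row block sparsity of a matrix
$\mathbf{X} \in (\mathbb{R}^{N \times n}, l_F)$ over $\mathcal{N}$ as

$$ \|\mathbf{X}\|_{0,\mathcal{N}} := \|f_{\mathcal{N}}(\mathbf{X})\|_0. \tag{16} $$

A matrix $\mathbf{X}$ is row block $M$-sparse if
$\|\mathbf{X}\|_{0,\mathcal{N}} \le M$.

We remark that a row block $M$-sparse matrix $\mathbf{X}$ can be
converted into a block $M$-sparse vector
$\operatorname{vec}(\mathbf{X}^T)$. Here, the operator vec transforms
a matrix into a column vector by stacking all the columns of the
matrix. For $\mathcal{N} := \{n_1, \ldots, n_C\}$ and
$\mathcal{N}' := \{n n_1, \ldots, n n_C\}$, the block sparsity of
$\mathbf{X} \in \mathbb{R}^{N \times n}$ over $\mathcal{N}$ is
preserved as

$$ \|\mathbf{X}\|_{0,\mathcal{N}} = \|\operatorname{vec}(\mathbf{X}^T)\|_{0,\mathcal{N}'}. \tag{17} $$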
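The map $f_{\mathcal{N}}$ and the block sparsity measures translate directly into a few lines of code. The sketch below is illustrative only: the function names `block_norms` and `block_l0`, the tolerance argument, and the toy matrix are assumptions. It also checks the preservation property (17) on the toy matrix.

```python
import numpy as np

def block_norms(X, sizes):
    """Map f_N of (13): Frobenius norm of each row block of X."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:                       # a vector is treated as n = 1, as in (14)
        X = X[:, None]
    edges = np.cumsum([0] + list(sizes))
    return np.array([np.linalg.norm(X[a:b]) for a, b in zip(edges[:-1], edges[1:])])

def block_l0(X, sizes, tol=0.0):
    """Block sparsity of (15) and (16): number of nonzero row blocks."""
    return int(np.sum(block_norms(X, sizes) > tol))

sizes = [2, 3, 1]                         # list N = {n_1, n_2, n_3}, so N = 6
X = np.zeros((6, 2))                      # n = 2 columns
X[2:5] = [[1.0, 0.0], [0.0, 2.0], [0.0, 0.0]]   # only the second block is nonzero

gamma = block_norms(X, sizes)
vec_XT = X.reshape(-1)                    # vec(X^T): the rows of X stacked
sizes_vec = [X.shape[1] * s for s in sizes]     # list N' = {n n_1, ..., n n_C}
print(gamma, block_l0(X, sizes), block_l0(vec_XT, sizes_vec))
```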

Sparse representation of subspace In the underdetermined case, the
columns of the matrix $\mathbf{S} \in \mathbb{R}^{d \times N}$
represent an overcomplete basis of $\mathbb{R}^d$ for $d < N$.
Equations (4) and (7) can be consistent with infinitely many
solutions $\boldsymbol{\alpha}$ and $\mathbf{A}$, respectively.

We denote the subspace of the query vector(s) by
$\mathcal{Q} = \operatorname{span} \mathbf{q}$ or
$\operatorname{span} \mathbf{Q}$. If a possible solution
$\boldsymbol{\alpha}$ or $\mathbf{A}$ is block sparse over
$\mathcal{N} = \{n_1, \ldots, n_C\}$, the query subspace
$\mathcal{Q}$ consists of a small minority of class subspaces
corresponding to the nonzero $\boldsymbol{\alpha}_k$ or
$\mathbf{A}_k$. In other words, the query subspace is sparsely
represented by the class subspaces. The sparsity of the subspace
representation can be quantified as
$\|\boldsymbol{\alpha}\|_{0,\mathcal{N}}$ or
$\|\mathbf{A}\|_{0,\mathcal{N}}$.

Classification space By definition, the block sparsity
$\|\boldsymbol{\alpha}\|_{0,\mathcal{N}}$ or
$\|\mathbf{A}\|_{0,\mathcal{N}}$ is measured by the $l^0$ norm of the
$C$-dimensional vector
$\boldsymbol{\gamma} := f_{\mathcal{N}}(\boldsymbol{\alpha})$ or
$f_{\mathcal{N}}(\mathbf{A})$. The components of
$\boldsymbol{\gamma}$ indicate the degrees of class membership. The
sparser $\boldsymbol{\gamma}$ is, the more certainly the class label
of each query is identified. The sparsity is properly measured by the
$l^0$ norm. Therefore, we refer to the normed space
$\mathcal{C} = (\mathbb{R}^C, l^0)$, where $\boldsymbol{\gamma}$
resides, as the classification space.

3. Classification based on sparse subspace representation

From the viewpoint of classification, each query vector is supposed
to be composed only of vectors from the subspace of the class to
which the query belongs. The subspace spanned by the query vectors
should be represented as sparsely as possible by the class subspaces
concerned with the queries. In our notation, the $C$-dimensional
vector in the classification space,
$\boldsymbol{\gamma} := f_{\mathcal{N}}(\boldsymbol{\alpha})$ or
$f_{\mathcal{N}}(\mathbf{A})$, is intended to be the sparsest. The
sparsity is properly measured by the $l^0$ norm of
$\boldsymbol{\gamma}$. Therefore, we incorporate minimization of the
$l^0$ norm into the classification framework.



3.1. Formulation 

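Let $\mathbf{S} \in \mathbb{R}^{d \times N}$ be the concatenation of
$\mathbf{S}_k \in \mathbb{R}^{d \times n_k}$ ($k = 1, \ldots, C$,
$d = \operatorname{rank} \mathbf{S} < N = \sum_{k=1}^{C} n_k$), i.e.,
the matrices of training datasets. Given the matrix
$\mathbf{Q} \in \mathbb{R}^{d \times n}$ of $n$ query vectors, we
solve the $l^0$-minimization problem:

$$ \min_{\mathbf{A}} \|\mathbf{A}\|_{0,\mathcal{N}} \quad \text{subject to} \quad \mathbf{Q} = \mathbf{S}\mathbf{A}. \tag{18} $$

Here, $\mathcal{N}$ specifies the sizes of the row blocks for
sparsification. Typically, $\mathcal{N} = \{n_1, \ldots, n_C\}$. The
block structure is dropped, leaving plain row sparsity of
$\mathbf{A}$, if
$\mathcal{N} = \mathcal{N}_1 := \{\forall n_i = 1, i = 1, \ldots, N\} = \{1, \ldots, 1\}$.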

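One can rewrite the problem (18) as

$$ \min_{\mathbf{A}} \|\operatorname{vec}(\mathbf{A}^T)\|_{0,\mathcal{N}'} \quad \text{subject to} \quad \operatorname{vec}(\mathbf{Q}^T) = (\mathbf{S} \otimes \mathbf{I}_n) \operatorname{vec}(\mathbf{A}^T), \tag{19} $$

where $\otimes$ denotes the Kronecker product, and $\mathbf{I}_n$ is
the identity matrix of size $n$. The list $\mathcal{N}'$ defines the
block sizes of the $nN$-dimensional vector
$\operatorname{vec}(\mathbf{A}^T)$.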
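The equivalence between the matrix constraint in (18) and the vectorized constraint in (19) can be verified numerically. This is a small sketch with hypothetical toy sizes and random matrices. It relies on the fact that stacking the rows of a matrix gives the vec of its transpose.

```python
import numpy as np

rng = np.random.default_rng(1)
d, N, n = 3, 5, 2                         # hypothetical toy sizes
S = rng.standard_normal((d, N))
A = rng.standard_normal((N, n))
Q = S @ A                                 # matrix form of the constraint

vec_AT = A.reshape(-1)                    # vec(A^T): rows of A stacked
vec_QT = Q.reshape(-1)                    # vec(Q^T): rows of Q stacked
lhs = np.kron(S, np.eye(n)) @ vec_AT      # (S kron I_n) vec(A^T)

print(np.allclose(lhs, vec_QT))
```

The Kronecker matrix has size $nd \times nN$, so it is mainly useful for analysis. For computation, the matrix form is much cheaper.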

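The $l^0$-minimization problem (19) is well investigated in the
literature [11]. The uniqueness of the solution is guaranteed under
the condition called the block restricted isometry property (block
RIP). Assuming $\mathbf{q}^{(j)} \in \mathcal{S}$, the RIP condition
for our problem can be described as

$$ (1 - \delta_{M|\mathcal{N}'}) \|\mathbf{v}\|_2^2 \le \|(\mathbf{S} \otimes \mathbf{I}_n)\mathbf{v}\|_2^2 \le (1 + \delta_{M|\mathcal{N}'}) \|\mathbf{v}\|_2^2 \quad \forall \mathbf{v} \in \mathbb{R}^{nN}, \tag{20} $$

where $\mathbf{v}$ ranges over the vectors that are block $M$-sparse
over $\mathcal{N}'$, and $\delta_{M|\mathcal{N}'}$ is called the
block-RIP constant dependent on the block sparsity $M$ over
$\mathcal{N}'$. In practice, we normalize the blocks $\mathbf{S}_k$
in order for the matrix $\mathbf{S} \otimes \mathbf{I}_n$ to satisfy
the condition. The block RIP condition is less stringent than the
standard RIP condition, which is widely used in the field of
compressed sensing [1] [2] [3] [4] [5].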

3.2. Dimensionality reduction 

In (18), we assume the linear system $\mathbf{Q} = \mathbf{S}\mathbf{A}$
to be underdetermined as $d = \operatorname{rank} \mathbf{S} < N$,
and regularize it by the $l^0$ minimization. Actually, we do not have
to deal with the queries and training data in a space of dimension
$\tilde{d} > N$. Recent works in the emerging area of compressed
sensing show that a small number of projections of a sparse vector
can contain enough of its salient information to recover
the vector with regularization that promotes sparsity 0] [5] 
FPU . The statements in IT3l [T4l guaranteeing the recovery 
are described as follows. 

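Theorem 1 Let $\tilde{\mathbf{x}} := \boldsymbol{\Psi}^T \mathbf{s}$
be a $\tilde{d}$-dimensional vector represented by an $m$-sparse
vector $\mathbf{s} \in \mathbb{R}^{\tilde{d}}$ using a basis
$\boldsymbol{\Psi}^T \in \mathbb{R}^{\tilde{d} \times \tilde{d}}$.
Then, $\mathbf{s}$ can be reconstructed from a $d$-dimensional vector
$\mathbf{x} := \boldsymbol{\Phi}\tilde{\mathbf{x}}$ with probability
$1 - e^{-O(d)}$. Here,
$\boldsymbol{\Phi} \in \mathbb{R}^{d \times \tilde{d}}$ is a random
matrix and $d > d_0 := O(m \log(\tilde{d}/m))$.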



In particular, $d \ge 2m \log(\tilde{d}/d)$ holds if
$m \ll \tilde{d}$ 03][l6). It is also possible to recover the sparse
vector $\mathbf{s}$ from a small number of projections, $\mathbf{x}$,
with overwhelming probability in the more general case where
$\boldsymbol{\Phi}$ and $\boldsymbol{\Psi}$ are incoherent |fl5l [TTl

a.

The reconstructability in Theorem 1 suggests that one can obtain the
$\tilde{d}$-dimensional $m$-sparse solution from a much lower
$d$-dimensional vector after a linear transformation. Wright et al.
[6] showed, in their framework of face recognition based on sparse
representation, that the computational cost is reduced without
significant loss of recognition rate by linear transformations into
lower dimensional feature spaces, such as Eigenfaces, Fisherfaces,
Laplacianfaces, downsampling, and random projection. These
transformations act as dimensionality reduction that preserves the
information needed for recognition. In particular, random projection
is a data-independent dimensionality reduction technique, from which
one can exactly recover the original $\tilde{d}$-dimensional vector
under the conditions of Theorem 1. For this reason, we employ
dimensionality reduction if $\tilde{d}$ is too high for computation.
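A Gaussian random projection of this kind can be sketched as follows. The scaling of the entries by $1/\sqrt{d}$, the function name `gaussian_projection`, and all sizes are assumptions made for illustration. The point of the sketch is that the same projection must be applied to the queries and the training data, so that the linear system $\mathbf{Q} = \mathbf{S}\mathbf{A}$ keeps the same coefficients after the reduction.

```python
import numpy as np

def gaussian_projection(d_low, d_high, seed=0):
    """Random matrix Phi with i.i.d. N(0, 1/d_low) entries (scaling is an assumption)."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((d_low, d_high)) / np.sqrt(d_low)

rng = np.random.default_rng(42)
d_high, d_low, N, n = 500, 40, 12, 3      # hypothetical toy sizes
S_high = rng.standard_normal((d_high, N)) # training data in the original space
A = rng.standard_normal((N, n))
Q_high = S_high @ A                       # queries in the original space

Phi = gaussian_projection(d_low, d_high)
S_low, Q_low = Phi @ S_high, Phi @ Q_high # the same Phi for both matrices

print(S_low.shape, Q_low.shape, np.allclose(Q_low, S_low @ A))
```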

3.3. Classifiers 

n-to-one classifier Since the minimizer $\mathbf{A}$ for (18) is a
row block $M$-sparse matrix, the $M$ blocks indicate the $M_C$
($M_C \le M$) classes concerned with the query subspaces. For the
task of classifying all $n$ queries into one class ($M_C = 1$), we
calculate the residuals of the representations by the class
subspaces:

$$ r_k(\mathbf{Q}; \mathbf{A}) := \|\mathbf{Q} - \mathbf{S}_k \mathbf{A}_k\|_F. \tag{21} $$

The residuals quantify the dissimilarities between the query subspace
and the class subspaces. Note that most of the residuals are
$\|\mathbf{Q}\|_F$ because of the sparsity. If the query subspace
$\mathcal{Q}$ can be approximately represented by one of the class
subspaces, the class label is identified as

$$ \arg\min_k r_k(\mathbf{Q}; \mathbf{A}). \tag{22} $$
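The residual rule of (21) and (22) can be sketched as follows, assuming that a row-block sparse coefficient matrix is already available. The function name `classify_n_to_one`, the toy data, and the 0-based class indices are illustrative assumptions.

```python
import numpy as np

def classify_n_to_one(Q, S, A, sizes):
    """Return the 0-based class index minimizing r_k in (21), and all residuals."""
    edges = np.cumsum([0] + list(sizes))
    residuals = np.array([
        np.linalg.norm(Q - S[:, a:b] @ A[a:b])     # Frobenius norm of Q - S_k A_k
        for a, b in zip(edges[:-1], edges[1:])
    ])
    return int(np.argmin(residuals)), residuals

rng = np.random.default_rng(0)
sizes = [3, 3, 3]                          # three classes with three samples each
S = rng.standard_normal((8, 9))
A = np.zeros((9, 2))
A[3:6] = rng.standard_normal((3, 2))       # only the block of class index 1 is active
Q = S @ A

label, residuals = classify_n_to_one(Q, S, A, sizes)
print(label, residuals)
```

For the inactive blocks the residual equals the Frobenius norm of the query matrix, which matches the remark above.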

This classification method achieves the same task as the mutual
subspace methods [18] [19] [20] with a fundamentally different
strategy. The mutual subspace methods are robust owing to the
multiple queries. The robustness is further enhanced by the block
sparsification in our scheme. The $l^0$ minimization in (18)
encourages the vector of class membership degrees,
$f_{\mathcal{N}}(\mathbf{A})$, to be as sparse as possible in the
classification space. For the underdetermined problem with a sparse
solution, recent works in the emerging area of compressed sensing
[1] [2] [3] [4] prove the exact recovery under the $l^0$ or $l^1$
regularization. Since the $l^0$/$l^1$ minimizer is very insensitive
to outliers, the sparse representation is robust compared to
conventional representations by $l^2$-based regularization, e.g.,
PCA.



We also remark that if $n = 1$ and $\mathcal{N} = \mathcal{N}_1$, our
n-to-one classification is exactly the same as the sparse
representation-based classification (SRC) proposed in [6]. Our
classification based on the sparse subspace representation is
therefore an extension of the SRC to multiple queries.

n-to-ones classifier It is also possible to classify $n$ queries into
their respective classes. We calculate the $C \times n$ residual
matrix whose $kj$-th entry measures the dissimilarity between the
$j$-th query and its reconstruction in the $k$-th subspace:

$$ r_k^{(j)}(\mathbf{Q}; \mathbf{A}) := \|\mathbf{q}^{(j)} - \mathbf{S}_k \boldsymbol{\alpha}_k^{(j)}\|_2. \tag{23} $$

Note that most of the residual entries are $\|\mathbf{q}^{(j)}\|_2$
because of the sparsity. If the query subspace $\mathcal{Q}$ can be
approximately represented by the union of a small number of class
subspaces, the class label for the $j$-th query is identified as

$$ \arg\min_k r_k^{(j)}(\mathbf{Q}; \mathbf{A}). \tag{24} $$

Again, our method is expected to be robust owing to the multiple
queries. Furthermore, the classes irrelevant to the queries are
strongly excluded by the $l^0$ minimization. Therefore, the
classifier (24) can detect the respective class for each query
without being given the number of relevant classes.
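The per-query rule of (23) and (24) differs from the n-to-one rule only in that the residual is kept for each column. The following sketch assumes that a row-block sparse coefficient matrix is given. The function name `classify_n_to_ones`, the toy data, and the 0-based class indices are illustrative assumptions.

```python
import numpy as np

def classify_n_to_ones(Q, S, A, sizes):
    """Return one 0-based class index per query and the C x n residual matrix of (23)."""
    edges = np.cumsum([0] + list(sizes))
    R = np.array([
        np.linalg.norm(Q - S[:, a:b] @ A[a:b], axis=0)   # l2 norm of each column
        for a, b in zip(edges[:-1], edges[1:])
    ])
    return R.argmin(axis=0), R

rng = np.random.default_rng(0)
sizes = [3, 3, 3]
S = rng.standard_normal((8, 9))
A = np.zeros((9, 3))
A[0:3, 0] = rng.standard_normal(3)         # query 0 comes from class index 0
A[6:9, 1] = rng.standard_normal(3)         # query 1 comes from class index 2
A[6:9, 2] = rng.standard_normal(3)         # query 2 comes from class index 2
Q = S @ A

labels, R = classify_n_to_ones(Q, S, A, sizes)
print(labels)
```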

n-to-M classifier Let us mention the potential of the sparse subspace
representation for finding n-to-M relations, although we do not go
into the details of this type of multiple classification in this
paper. If a query simultaneously belongs to multiple classes, the
query vector is represented as a linear combination of vectors from
the subspaces of the relevant classes. The residuals $r_k^{(j)}$ for
such a query cannot be zero, but the relevant classes are found by
thresholding $r_k^{(j)}$. Thus, each of the $n$ queries is assigned
to some of the $M$ classes.

Classification validity A classifier should answer "invalid" if the
given query belongs to an unknown class. As suggested in [6], such an
unclassifiable query can be detected by measuring how strongly the
nonzero components of $\mathbf{A}$ concentrate on a single class.
Wright et al. defined the sparsity concentration index (SCI), which
quantifies the validity of the classification [6]. One may compute
the SCI for each column of $\mathbf{A}$ to validate the corresponding
query.
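As a rough illustration, the SCI of a coefficient vector can be computed as below. The sketch assumes the definition commonly quoted from [6], $\mathrm{SCI}(\boldsymbol{\alpha}) = (C \max_k \|\boldsymbol{\alpha}_k\|_1 / \|\boldsymbol{\alpha}\|_1 - 1)/(C - 1)$, which ranges from 0 (coefficients spread evenly over the classes) to 1 (all coefficients in one class). The function name `sci` and the convention for an all-zero vector are assumptions.

```python
import numpy as np

def sci(alpha, sizes):
    """Sparsity concentration index of one coefficient vector over the class blocks."""
    alpha = np.asarray(alpha, dtype=float)
    edges = np.cumsum([0] + list(sizes))
    C = len(sizes)
    total = np.abs(alpha).sum()
    if total == 0.0:
        return 0.0                         # convention for an all-zero vector (assumption)
    block_l1 = [np.abs(alpha[a:b]).sum() for a, b in zip(edges[:-1], edges[1:])]
    return (C * max(block_l1) / total - 1.0) / (C - 1.0)

sizes = [2, 2, 2]
concentrated = sci([0, 0, 3, -1, 0, 0], sizes)   # all coefficients in one class
spread = sci([1, 1, 1, 1, 1, 1], sizes)          # evenly spread over the classes
print(concentrated, spread)
```

A query could then be rejected as invalid when its SCI falls below a threshold chosen by the user.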

3.4. Sparse subspace method 

Our classification method based on the sparse subspace representation
is summarized in Algorithm 1.

Algorithm 1 Sparse subspace method (SSM)

Input: $\mathbf{Q} \in \mathbb{R}^{d \times n}$: matrix of $n$
queries as (6), $\mathbf{S} \in \mathbb{R}^{d \times N}$:
concatenated matrix of training datasets as (3), $\mathcal{N}$: list
of row block sizes;

Output: $\mathcal{L}$: set of class labels;

1 perform dimensionality reduction of $\mathbf{Q}$ and $\mathbf{S}$
  if $d$ is intractably high;

2 normalize the columns of $\mathbf{S}$ to have unit $l^2$ norm;

3 decompose $\mathbf{Q}$ with respect to $\mathbf{S}$ to obtain the
  sparse subspace representation;

4 find the class label
  $\mathcal{L} = \{\arg\min_k r_k(\mathbf{Q}; \mathbf{A})\}$ or
  $\mathcal{L} = \{\arg\min_k r_k^{(1)}(\mathbf{Q}; \mathbf{A}), \ldots, \arg\min_k r_k^{(n)}(\mathbf{Q}; \mathbf{A})\}$.



The major concern is the sparse subspace decomposition of
$\mathbf{Q}$ at Step 3. In the next section, we present a practical
algorithm for the decomposition, SSD-ROMP, which efficiently and
stably provides an approximate solution to (18).

4. Sparse subspace decomposition 

The sparse decomposition of $\mathbf{Q}$ in (18) is considered as an
MMV problem whose solution is row-block sparse. The solution has two
important characteristics: the column vectors of $\mathbf{A}$ share
nonzero blocks as their support, and the block partitions are fixed
by $\mathcal{N}$ in advance.

4.1. Prior work on MMV 

Configuration of the nonzero entries in the solution A 
is called the joint sparsity model (JSM) 12T1 |22l . There 
are some prior works on the MMV problems with sev- 
eral JSMs ||9] [H ED E2 E3 Q3] El. Most of them 
EHOlEDEllESlES focus on a JSM in which the column 
vectors $\boldsymbol{\alpha}^{(j)}$ simply share their support $T$.
This JSM is the special case of our row-block sparsity model with
$\mathcal{N} = \mathcal{N}_1$ described in Section 3.1. Efficient
algorithms for the MMV problem with this JSM have been designed as
extensions of greedy algorithms such as matching pursuit (MP) and
orthogonal matching pursuit (OMP) [26] [27] [28] [29] [30]. OMP is an
efficient algorithm that can recover an $m$-sparse vector from an
$O(m \log N)$-dimensional vector [30]. It iteratively selects the
basis vector (a column of the dictionary, $\mathbf{S}$ in our
notation) with the largest contribution to the current residual,
greedily reducing the representation error at each iteration. The
existing MP- and OMP-based algorithms for the MMV problem can be
directly used for our problem with the row-block sparsity model only
when $\mathcal{N} = \mathcal{N}_1$.

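Eldar and Mishali [11] introduced the block sparsity model and the
block RIP condition applicable to MMV problems including ours. The
uniqueness is guaranteed under the block RIP condition, as noted in
Section 3.1. By $l^1$ convex relaxation, we can cast the vectorized
version in (19) as

$$ \min_{\mathbf{A}} \|\operatorname{vec}(\mathbf{A}^T)\|_{1,\mathcal{N}'} \quad \text{subject to} \quad \operatorname{vec}(\mathbf{Q}^T) = (\mathbf{S} \otimes \mathbf{I}_n) \operatorname{vec}(\mathbf{A}^T). \tag{25} $$

Here, we redefine $f_{\mathcal{N}}$ as a map from
$(\mathbb{R}^{N \times n}, l_F)$ to $(\mathbb{R}^C, l^1)$ in the same
form as (13), and define¹

$$ \|\mathbf{A}\|_{1,\mathcal{N}} := \|f_{\mathcal{N}}(\mathbf{A})\|_1. \tag{26} $$

According to [11], this $l^1$ minimization problem is a second-order
cone problem (SOCP).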
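The difference between the block $l^0$ measure of (16) and its $l^1$ relaxation of (26) can be seen on a toy matrix. The sketch below computes both from the same block norms. The function name `mixed_norms` and the example values are illustrative assumptions.

```python
import numpy as np

def mixed_norms(A, sizes):
    """Return (block l0 count, block l1 norm) of A over the row blocks in sizes."""
    edges = np.cumsum([0] + list(sizes))
    gamma = np.array([np.linalg.norm(A[a:b]) for a, b in zip(edges[:-1], edges[1:])])
    return int(np.count_nonzero(gamma)), float(gamma.sum())

sizes = [2, 2, 1]
A = np.zeros((5, 2))
A[2:4] = [[3.0, 0.0], [0.0, 4.0]]          # one block with Frobenius norm 5
A[4] = [0.0, 1.0]                          # a second, weaker block with norm 1

l0_value, l1_value = mixed_norms(A, sizes)
print(l0_value, l1_value)
```

The $l^0$ measure only counts the active blocks, whereas the relaxed norm also depends on their magnitudes, which is what makes it convex.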

4.2. Sub-optimal algorithm 



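Algorithm 2 Sparse subspace decomposition (SSD-ROMP)

Input: $\mathbf{Q} \in \mathbb{R}^{d \times n}$: matrix of $n$
queries as (6), $\mathbf{S} \in \mathbb{R}^{d \times N}$:
concatenated matrix of training datasets as (3), $\mathcal{N}$: list
of row block sizes, $M_0$: sparsity level;

Output: $\mathbf{A}$: row-block sparse matrix as (10), $\mathcal{I}$:
set of indices of nonzero blocks;

 1 let the index set $\mathcal{I} := \emptyset$ and residual
   $\mathbf{R} := \mathbf{Q}$;

 2 repeat

 3   $\mathbf{U} := \mathbf{S}^T \mathbf{R}$;

 4   $\boldsymbol{\gamma} := f_{\mathcal{N}}(\mathbf{U})$;

 5   let $\mathcal{J}$ be a set of indices of the $M_0$ biggest
     components of $\boldsymbol{\gamma}$, or all of its nonzero
     components, whichever set is smaller;

 6   sort $\mathcal{J}$ in descending order of the components
     $\boldsymbol{\gamma}$;

 7   among all subsets $\mathcal{J}_0 \subset \mathcal{J}$ such that
     $\gamma_i \le 2\gamma_j$ for all $i < j \in \mathcal{J}_0$,
     choose $\mathcal{J}_0$ with the maximal energy
     $\|\boldsymbol{\gamma}|_{\mathcal{J}_0}\|_2^2 := \sum_{k \in \mathcal{J}_0} \gamma_k^2$;

 8   $\mathcal{I} := \mathcal{I} \cup \mathcal{J}_0$;

 9   for each $j$ do

10     $\boldsymbol{\alpha}^{(j)} := \arg\min_{\boldsymbol{\alpha}} \|\mathbf{q}^{(j)} - \sum_{k \in \mathcal{I}} \mathbf{S}_k \boldsymbol{\alpha}_k\|_2$;

11   end for

12   $\mathbf{R} := \mathbf{Q} - \sum_{k \in \mathcal{I}} \mathbf{S}_k \mathbf{A}_k$;

13 until $\|\mathbf{R}\|_F = 0$ or $\operatorname{card} \mathcal{I} > 2M_0$.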



We present a practical greedy algorithm for the block sparse
decomposition. Although there are optimization packages that solve
the SOCP in polynomial time, we prefer a simple and efficient sparse
recovery algorithm like MP and OMP. In contrast to signal recovery in
compressed sensing, approximate solutions may be enough for the
classification purpose. Since the sparsity level is at most $O(n)$
for $n$ queries, we want the decomposition algorithm to work
efficiently in the case of extreme sparseness.

¹The norm $\|\cdot\|_{2,1}$ defined in [11] is the same as our
$\|\cdot\|_{1,\mathcal{N}}$, and it is actually the $l^1$ norm
through $f_{\mathcal{N}}$ as we defined.
Figure 1. Examples of one-to-one classification. The first, second, and third columns respectively show the query images, the residuals of the
representations by SSM with respect to 38 subjects, and those by SRC. (a) A valid query image of subject #16. (b) The same query image
as (a) with 30% of pixels corrupted by salt-and-pepper noise. (c) An invalid image from an unlearned face database.



We adopt the regularized OMP (ROMP) [31] because it can stably
provide an approximate solution from noisy queries. We modify the
ROMP to seek the nonzero row blocks of the solution as shown in
Algorithm 2. This algorithm selects multiple row blocks of
$\mathbf{S}^T\mathbf{S}\mathbf{A}$ that have comparable magnitudes
measured by $f_{\mathcal{N}}$ at each iteration. Note that the
algorithm requires the additional parameter $M_0 = O(M)$, although
the solution is insensitive to this parameter.
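The block-wise modification of ROMP can be sketched in NumPy as follows. This is an illustrative reading of Algorithm 2 under stated assumptions, not a reference implementation: the function name `ssd_romp`, the numerical tolerance `tol` that replaces the exact test for a zero residual, the safeguard that stops when no new block is selected, and the joint least squares over all queries are choices made for this sketch.

```python
import numpy as np

def ssd_romp(Q, S, sizes, M0, tol=1e-8):
    """Sketch of Algorithm 2: returns the coefficient matrix A and the block indices."""
    edges = np.cumsum([0] + list(sizes))
    cols_of = [np.arange(a, b) for a, b in zip(edges[:-1], edges[1:])]
    A = np.zeros((S.shape[1], Q.shape[1]))
    selected = []                                    # index set I
    R = Q.copy()                                     # step 1
    while True:                                      # step 2
        U = S.T @ R                                  # step 3
        gamma = np.array([np.linalg.norm(U[c]) for c in cols_of])   # step 4
        # Steps 5-6: up to M0 largest nonzero components, in descending order.
        J = [k for k in np.argsort(-gamma)[:M0] if gamma[k] > 0]
        # Step 7: comparable subset (gamma_i <= 2 gamma_j) with maximal energy.
        # In a descending list, such subsets are covered by contiguous runs.
        best, best_energy = [], -1.0
        for s in range(len(J)):
            run = [k for k in J[s:] if gamma[J[s]] <= 2 * gamma[k]]
            energy = sum(gamma[k] ** 2 for k in run)
            if energy > best_energy:
                best, best_energy = run, energy
        new = [int(k) for k in best if k not in selected]
        if not new:                                  # safeguard (assumption): no progress
            break
        selected += new                              # step 8
        cols = np.concatenate([cols_of[k] for k in selected])
        coef, *_ = np.linalg.lstsq(S[:, cols], Q, rcond=None)   # steps 9-11
        A[:] = 0.0
        A[cols] = coef
        R = Q - S @ A                                # step 12
        if np.linalg.norm(R) <= tol or len(selected) > 2 * M0:  # step 13
            break
    return A, selected

rng = np.random.default_rng(0)
d, C, n_k, n = 64, 20, 4, 3                # hypothetical sizes, N = 80 > d
sizes = [n_k] * C
S = rng.standard_normal((d, C * n_k))
S /= np.linalg.norm(S, axis=0)             # unit l2 columns, as in Algorithm 1
true_class = 7                             # 0-based index of the class of the queries
block = slice(true_class * n_k, (true_class + 1) * n_k)
Q = S[:, block] @ rng.standard_normal((n_k, n))

A, selected = ssd_romp(Q, S, sizes, M0=2)
block_energy = [np.linalg.norm(A[a:a + n_k]) for a in range(0, C * n_k, n_k)]
print(selected, int(np.argmax(block_energy)))
```

In this toy setting the queries lie exactly in one class subspace, so the least squares step at the first iteration already drives the residual to numerical zero.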

Intensive computations are the matrix multiplication at 
Step [3] and the least squares problem at Step [10] which 
cost 0(nNd) and 0(nM^jd) time, respectively. The cost 
of least squares problem can be reduced to 0(nM^d) by 
the conjugate gradient (CG) method as suggested in |3~T1 . 
The total running time of Algorithm [2] is 0(nM^Nd) or 
O(nM Nd) using CG. 

5. Experiments 

We demonstrate our sparse subspace method (SSM) described in
Algorithm 1. We perform face recognition experiments using a cropped
version of the Extended Yale Face Database B [32] [33]. The database
consists of 2,414 face images of 38 individuals. We randomly select
half of the images of each subject for the training dataset
($n_k \approx 32$, $k = 1, \ldots, 38$), and the other half for
queries. Each image is expressed as a
$\tilde{d} = 192 \times 168 = 32{,}256$ dimensional vector storing
the grayscale values.



One-to-one classification Figure 1 shows examples of one-to-one
classification. The SSM tries to answer a class label for a single
query. We reduced the dimensionality to $d = 1{,}024$ by the Gaussian
random projection at Step 1 in Algorithm 1. We set the block sizes
$\mathcal{N} = \{n_1, \ldots, n_{38}\}$ and the sparsity level
$M_0 = 4$ in Algorithm 2. Since SSM behaves as the SRC [6] when
$\mathcal{N} = \mathcal{N}_1$, we also executed the SRC implemented
with ROMP. The SSM and SRC, including the random projection, run in
less than 0.2 seconds on a moderate workstation.

For the valid query image of subject #16 in Fig. 1(a), we see that
only the residual $r_{16}$ is significantly small. The SSM and SRC
stably detect $r_{16}$ as the smallest even if the query is
contaminated with noise, as shown in Fig. 1(b), before the
dimensionality reduction. We also observe in Fig. 1(c) that none of
the residuals is significantly small for the invalid query (taken
from the UMIST face database [34]). In all cases, the residuals tend
to be left undisturbed in SSM, although the classification results
are the same as those of SRC. This indicates that irrelevant class
subspaces are ruled out by the block sparse model.

n-to-one classification For different numbers $n$ of queries, we
evaluated the recognition rate of the n-to-one classifier with
respect to the feature dimension $d$ reduced by Gaussian random
projection. For $d > 120$, the recognition rate increases with $n$
and $d$ as shown in Fig. 2. The rate is enhanced to more than 99% at
$d > 350$ with $n > 4$ queries. Perfect classification is achieved at
$d > 400$ with $n > 8$ queries. The n-to-one classifier provides
better performance than the one-to-one classifier applied to each
query, because the n-to-one classifier takes advantage of the joint
sparsity. However, the SSM did not improve the recognition rate at
low dimensions $d < 120$. We should address this matter in future
work.

n-to-ones classification We also performed the n-to-ones
classification. Figure 3 shows an example using the Extended Yale
Face Database B. We gave the classifier five query images, three of
which are taken from subject #5 and two from subject #29. These five
queries are classified into their respective classes, as indicated by
the significantly small residuals.




Figure 2. Recognition rates of the n-to-one classifier on the Extended Yale
B database with respect to the feature dimension (horizontal axis: feature dimension, 0 to 500).



Figure 3. An example of n-to-ones classification. Residuals
$r_k^{(1)}, \ldots, r_k^{(5)}$ are shown from top to bottom. Each of $n = 5$
queries is classified into one of two classes $k = 5$ and $29$.



6. Concluding remarks

We have developed the sparse subspace method (SSM), which enables us
to classify multiple queries into their respective classes
simultaneously. The SSM is based on the sparse decomposition of the
query subspace. The query subspace is represented only by the
relevant class subspaces. Since this sparse decomposition can be cast
as the MMV problem with a row-block joint sparsity model, the
uniqueness, robustness and recovery of the solution are guaranteed
under the block RIP condition. We realized the block sparse
decomposition by modifying the greedy algorithm ROMP. We
experimentally showed that the classification of multiple queries
improves the recognition rate on a face database. The joint sparsity
model and the decomposition algorithm should be improved further. A
more detailed performance evaluation also remains for future work.